Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Report versioning errors in StorageCluster's status #1447

Closed
wants to merge 1 commit into from

Conversation

rexagod
Copy link
Member

@rexagod rexagod commented Dec 21, 2021

Earlier, only the status of the storage cluster was being updated,
because of which, side effects such as version mismatch came in to play.

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 10, 2022
Copy link
Contributor

@umangachapagain umangachapagain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand why we need to do this, but an operator updating it's own CR is probably not a good idea. We need to re-work on the whole version thing. So, putting this on hold while we explore other options.

/hold

@umangachapagain
Copy link
Contributor

It seems late initialization is fine.

Can we patch only the version field from inside version check, instead of updating the entire resource?
If it fails, we'll retry on next reconcile instead of error out.

Changes in this PR can be reverted.
@jarrpa WDYT?

@jarrpa
Copy link
Member

jarrpa commented Feb 1, 2022

I would want to see a selective patch rather than a full update. We run the risk of overriding a change that came in right after the current reconciliation event, and we don't want to touch any other spec.

@rexagod rexagod force-pushed the 1855339 branch 2 times, most recently from ef8fc19 to 55b704d Compare February 22, 2022 13:59
@nb-ohad
Copy link
Contributor

nb-ohad commented Feb 22, 2022

@rexagod @jarrpa @umangachapagain

I think Patch is the wrong approach here as it does not guarantee the replacement of the old status with the new status.
We might start to see the strange status values-based as of merge semantics.

I will vote to stay with update and maybe add retries to the status update

@rexagod
Copy link
Member Author

rexagod commented Feb 22, 2022

/hold

I think I missed Ohad's comment before pushing the latest commit. Nonetheless, I'll put this PR on hold until we reach a common consensus here.

Copy link
Member

@jarrpa jarrpa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It took some doing but I think I understand the problem with the Patch() strategy now.

In general a Patch() strategy has its place, and it is not here. Specifically this is because the StorageCluster reconcile loop is designed to collect changes to the Status and push them all at once at the end of the reconcile loop (whether or not reconciliation completed successfully is a different story). As such what Ohad is saying is correct, we shouldn't be "merging" new state information, we should be "replacing" the whole thing since we are taking the role as the source of truth on the current state of the system at that point in time.

Now, it is true that we have a bug in how we deal with the Version field in the spec. But also, the way we're using the version field is completely broken, we should never be changing the spec to begin with. I would rather see the version field be dropped entirely and have the "version" information moved to the Status subresource. That will require a bit of design discussion, but shouldn't be too bad.

All that being said, there is still some worthwhile work being done in this PR:

  • I actually like the idea of having the version string be empty in version.go, so that should be kept.
  • The versionCheck error condition is also a great addition, so keep that as well.

If you want to change this PR to just those two changes, I'm all right with that. But I would rather have a better conversation on whether we can safely stop using the spec.version field once and for all. We'll have to coordinate with the UI team to know when they'd be able to update the Console code for that, but as it stands it's already broken so what more harm could it do? ¯_(ツ)_/¯

@rexagod rexagod changed the title Apply storage cluster changes after reconcilation Report versioning errors in StorageCluster's status Mar 7, 2022
@umangachapagain
Copy link
Contributor

* I actually like the idea of having the version string be empty in version.go, so that should be kept.

We use version info to generate a version metric . As long as this doesn't break, I'm fine with it.

package version
import (
"github.com/red-hat-storage/ocs-operator/version"
)
// GetVersion returns the version of the exporter.
func GetVersion() string {
// version of the Operator
return version.Version
}

@openshift-ci openshift-ci bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 13, 2022
@openshift-ci openshift-ci bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 15, 2022
@rexagod
Copy link
Member Author

rexagod commented Mar 15, 2022

I think we'll need to include the version string since ldflags are not in the scope of unit tests, so version.Version will always be "" for them.

https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/red-hat-storage_ocs-operator/1447/pull-ci-red-hat-storage-ocs-operator-main-ocs-operator-ci/1503646574947864576#1:build-log.txt%3A287-292

@rexagod
Copy link
Member Author

rexagod commented Mar 29, 2022

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Mar 29, 2022
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 29, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: rexagod, umangachapagain

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 29, 2022
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Mar 29, 2022
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 29, 2022

New changes are detected. LGTM label has been removed.

@rexagod
Copy link
Member Author

rexagod commented Mar 29, 2022

/retest

@rexagod
Copy link
Member Author

rexagod commented Mar 29, 2022

Will /unhold once the e2e tests pass.

@rexagod
Copy link
Member Author

rexagod commented May 31, 2022

/retest
/unhold

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 31, 2022
Update StorageCluster status on versioning error. Append an error status
condition to StorageCluster status field when an error arises during `versionCheck`.

Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>
@openshift-ci
Copy link
Contributor

openshift-ci bot commented May 31, 2022

@rexagod: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/ocs-operator-ci dfc7a80 link true /test ocs-operator-ci
ci/prow/verify-latest-csv dfc7a80 link true /test verify-latest-csv
ci/prow/red-hat-storage-ocs-ci-e2e-aws dfc7a80 link false /test red-hat-storage-ocs-ci-e2e-aws

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@rexagod
Copy link
Member Author

rexagod commented Jun 10, 2022

Follow-up PR: #1706 (spec.version deprecation).

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 6, 2022
@openshift-merge-robot
Copy link
Contributor

@rexagod: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@malayparida2000
Copy link
Contributor

@rexagod I think this PR is stale now & no longer required. So can we close this?

@Nikhil-Ladha
Copy link
Member

Looking at the latest bz status, I think we still need this PR.
Also, as the depreciation PR is merged we should be good to test this PR on a nightly/test build, once rebased.

@rexagod
Copy link
Member Author

rexagod commented Jan 22, 2023

Not sure if we still need this, but since I've moved to a different team recently, I'd appreciate if someone with bandwidth can pick this up.

Closing, TIA.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants